Distributed Similarity and Plagiarism Search

Author

  • Máté Pataki
Abstract

This paper describes the different approaches to plagiarism search and the methods used by the KOPI Online Plagiarism Search and Information Portal, and presents a distributed approach to building a plagiarism search system. This architecture makes the system scalable by allowing an arbitrary number of identical components to be added to it. To reduce network traffic and enable secure transfer of documents between the portal and the document servers, a new method of communication is introduced.

Similar resources

Fingerprint-based Similarity Search and its Applications

This paper introduces a new technology and tools from the field of text-based information retrieval. The authors have developed – a fingerprint-based method for a highly efficient near similarity search, and – an application of this method to identify plagiarized passages in large document collections. The contribution of our work is twofold. Firstly, it is a search technology that enables a ne...

External Plagiarism Detection based on Human Behaviors in Producing Paraphrases of Sentences in English and Persian Languages

With the advent of the internet and easy access to digital libraries, plagiarism has become a major issue. Applying search engines is one of the plagiarism detection techniques that converts plagiarism patterns to search queries. Generating suitable queries is the heart of this technique and existing methods suffer from lack of producing accurate queries, Precision and Speed of retrieved result...

Empowering Plagiarism Detection with a Web Services Enabled Collaborative Network

This paper explains how collaborative efforts in terms of technology and content, can help improve plagiarism detection and prevention. It presents a web service oriented architecture, which utilizes the collective strength of various search engines, context matching algorithms and indexing contributed by users. The proposed framework is an open source tool, yet it is extremely efficient and ef...

A new approach for searching translated plagiarism

Detecting plagiarism and similarity between documents written in the same language can be done with high precision with today’s top search systems; there are both free ones, e.g. Plagiarisma (2012) and Copyscape (2012), and commercial ones, e.g. PlagAware (2012) and turnitin (2012). With the spread of foreign language knowledge and the growing number of international students, a new form of ...

Unsupervised Ranking for Plagiarism Source Retrieval Notebook for PAN at CLEF 2013

The source retrieval task for plagiarism detection involves the use of a search engine to retrieve candidate sources of plagiarism for a suspicious document and provides a way to efficiently identify candidate documents so that more accurate comparisons can take place. We describe a strategy for source retrieval that makes use of an unsupervised ranking method to rank the results returned by a ...

Publication date: 2006